Structural bioinformatics prediction of membrane-binding proteins.

Authors

  • Nitin Bhardwaj
  • Robert V Stahelin
  • Robert E Langlois
  • Wonhwa Cho
  • Hui Lu
Abstract

Membrane-binding peripheral proteins play important roles in many biological processes, including cell signaling and membrane trafficking. Unlike integral membrane proteins, these proteins bind the membrane mostly in a reversible manner. Since peripheral proteins do not have canonical transmembrane segments, it is difficult to identify them from their amino acid sequences. As a first step toward genome-scale identification of membrane-binding peripheral proteins, we built a kernel-based machine learning protocol. Key features of known membrane-binding proteins, including electrostatic properties and amino acid composition, were calculated from their amino acid sequences and tertiary structures, which were then incorporated into the support vector machine to perform the classification. A data set of 40 membrane-binding proteins and 230 non-membrane-binding proteins was used to construct and validate the protocol. Cross-validation and holdout evaluation of the protocol showed that the accuracy of the prediction reached up to 93.7% and 91.6%, respectively. The protocol was applied to the prediction of membrane-binding properties of four C2 domains from novel protein kinases C. Although these C2 domains have 50% sequence identity, only one of them was predicted to bind the membrane, which was verified experimentally with surface plasmon resonance analysis. These results suggest that our protocol can be used for predicting membrane-binding properties of a wide variety of modular domains and may be further extended to genome-scale identification of membrane-binding peripheral proteins.
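The sequence-derived part of the feature calculation described above can be illustrated with a minimal sketch. The abstract does not specify the exact features or their encoding, so everything here is an assumption made for illustration: the function names (`composition`, `net_charge`, `feature_vector`), the crude charge rule (+1 for K/R, -1 for D/E), and the 21-element layout are hypothetical and are not the authors' protocol. Structure-based electrostatic features are not covered.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order used for the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq: str) -> list[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def net_charge(seq: str) -> int:
    """Crude net charge (assumption): +1 per K/R, -1 per D/E, His ignored."""
    s = seq.upper()
    positive = s.count("K") + s.count("R")
    negative = s.count("D") + s.count("E")
    return positive - negative


def feature_vector(seq: str) -> list[float]:
    """Hypothetical 21-element vector: composition plus charge per residue."""
    n_standard = sum(1 for ch in seq.upper() if ch in AMINO_ACIDS)
    comp = composition(seq)  # raises ValueError if n_standard == 0
    return comp + [net_charge(seq) / n_standard]


if __name__ == "__main__":
    # Toy sequence, not taken from the paper's data set.
    print(feature_vector("MKKRLDEAAK"))
```

Vectors of this kind, computed for each protein in a labeled data set, could then be passed to any support vector machine implementation for training and cross-validation.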

Similar resources

In silico investigation of lactoferrin protein characterizations for the prediction of anti-microbial properties

Lactoferrin (Lf) is an iron-binding multi-functional glycoprotein which has numerous physiological functions such as iron transportation, anti-microbial activity and immune response. In this study, different in silico approaches were exploited to investigate Lf protein properties in a number of mammalian species. Results showed that the iron-binding site, DNA and RNA-binding sites, signal pepti...

Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function

MOTIVATION Template-based prediction of DNA binding proteins requires not only structural similarity between target and template structures but also prediction of binding affinity between the target and DNA to ensure binding. Here, we propose to predict protein-DNA binding affinity by introducing a new volume-fraction correction to a statistical energy function based on a distance-scaled, finit...

String Kernels and High-Quality Data Set for Improved Prediction of Kinked Helices in α-Helical Membrane Proteins

The reasons for distortions from optimal α-helical geometry are widely unknown, but their influences on structural changes of proteins are significant. Hence, their prediction is a crucial problem in structural bioinformatics. For the particular case of kink prediction, we generated a data set of 132 membrane proteins containing 1014 manually labeled helices and examined the environment of kink...

Prediction of sub-cavity binding preferences using an adaptive physicochemical structure representation

MOTIVATION The ability to predict binding profiles for an arbitrary protein can significantly improve the areas of drug discovery, lead optimization and protein function prediction. At present, there are no successful algorithms capable of predicting binding profiles for novel proteins. Existing methods typically rely on manually curated templates or entire active site comparison. Consequently,...

Machine learning models in protein bioinformatics.

Bioinformatics is a relatively new field concerned with the computational analysis and prediction of properties of biomolecules, DNA, RNA, and proteins, in particular, on a genomic/proteomic scale. Machine learning models play increasingly important roles in development of novel methodologies, summarization, and high-throughput analysis in the bioinformatics field. Advances in the related area,...

Journal:
  • Journal of Molecular Biology

Volume 359, issue 2

Pages: -

Publication year: 2006